Multiagent Low-Dimensional Linear Bandits

Authors

Abstract

We study a multiagent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. By distributing the search for the optimal subspace across users and learning the unknown vector in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than in the case when agents do not communicate. We finally complement these results through simulations.
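To make the phrase "projected variant of LinUCB" more concrete, here is a minimal single-agent sketch in Python. It is not the paper's algorithm: it assumes the subspace is already known and given by a matrix `U` with orthonormal columns, it assumes a finite arm set, and the names `projected_linucb`, `reward_fn`, `alpha`, and `lam` are hypothetical. The subspace-recommendation step between agents is omitted entirely. The idea it illustrates is only that, when $\theta^*$ lies in the span of `U`, the reward of an arm $x$ depends on $x$ only through its coordinates $U^\top x$, so the ridge regression and confidence widths can be computed in the lower dimension.

```python
import numpy as np


def project_arms(arms, U):
    """Coordinates of each d-dimensional arm in the subspace spanned by U.

    arms: (K, d) array of arms; U: (d, m) array with orthonormal columns.
    """
    return arms @ U  # shape (K, m)


def projected_linucb(arms, U, reward_fn, T, alpha=1.0, lam=1.0):
    """LinUCB-style loop run in the m-dimensional coordinates of span(U).

    reward_fn(k) returns the observed reward of arm index k (an assumption
    of this sketch). alpha is the exploration weight and lam the ridge
    regularizer; both are illustrative defaults, not tuned values.
    """
    Z = project_arms(arms, U)
    m = Z.shape[1]
    V = lam * np.eye(m)   # regularized Gram matrix in subspace coordinates
    b = np.zeros(m)       # running sum of reward-weighted projected arms
    chosen = []
    for _ in range(T):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b
        # Confidence width sqrt(z^T V^{-1} z) for every projected arm z.
        quad = np.einsum("ij,jk,ik->i", Z, V_inv, Z)
        widths = np.sqrt(np.maximum(quad, 0.0))
        k = int(np.argmax(Z @ theta_hat + alpha * widths))
        r = reward_fn(k)
        V += np.outer(Z[k], Z[k])
        b += r * Z[k]
        chosen.append(k)
    theta_hat = np.linalg.solve(V, b)
    # Map the low-dimensional estimate back to the ambient space.
    return U @ theta_hat, chosen
```

Because every update involves only $m \times m$ matrices, the estimation cost and the confidence widths scale with the subspace dimension $m$ rather than the ambient dimension $d$, which is the intuition behind the regret reduction described in the abstract.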

Similar Resources

Conservative Contextual Linear Bandits

Safety is a desirable property that can immensely increase the applicability of learning algorithms in real-world decision-making problems. It is much easier for a company to deploy an algorithm that is safe, i.e., guaranteed to perform at least as well as a baseline. In this paper, we study the issue of safety in contextual linear bandits that have application in many different fields includin...

Misspecified Linear Bandits

We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold under the assumption that the arms' expected rewards are perfectly linear in their features. It is, however, of interest to investigate the impact of potentia...

Structured Stochastic Linear Bandits

The stochastic linear bandit problem proceeds in rounds where at each round the algorithm selects a vector from a decision set after which it receives a noisy linear loss parameterized by an unknown vector. The goal in such a problem is to minimize the (pseudo) regret which is the difference between the total expected loss of the algorithm and the total expected loss of the best fixed vector in...

Stochastic Low-Rank Bandits

Many problems in computer vision and recommender systems involve low-rank matrices. In this work, we study the problem of finding the maximum entry of a stochastic low-rank matrix from sequential observations. At each step, a learning agent chooses pairs of row and column arms, and receives the noisy product of their latent values as a reward. The main challenge is that the latent values are un...

Structured Stochastic Linear Bandits (DRAFT)

In this paper, we consider the structured stochastic linear bandit problem, which is a sequential decision-making problem where at each round $t$ the algorithm has to select a $p$-dimensional vector $x_t$ from a convex set, after which it observes a loss $\ell_t(x_t)$. We assume the loss is a linear function of the vector and an unknown parameter $\theta^*$. We consider the problem when $\theta^*$ is structured which we chara...

Journal

Journal title: IEEE Transactions on Automatic Control

Year: 2023

ISSN: 0018-9286, 1558-2523, 2334-3303

DOI: https://doi.org/10.1109/tac.2022.3179521